Using Response Functions for Strategy Training and Evaluation

Author

  • Trevor Davis

Abstract

Extensive-form games are a powerful framework for modeling sequential multiagent interactions. In extensive-form games with imperfect information, Nash equilibria are generally used as a solution concept, but computing a Nash equilibrium can be intractable in large games. Instead, a variety of techniques are used to find strategies that approximate Nash equilibria. Traditionally, an approximate Nash equilibrium strategy is evaluated by measuring the strategy's worst-case performance, or exploitability. However, because exploitability fails to capture how likely the worst case is to be realized, it provides only a limited picture of strategy strength, and there is extensive empirical evidence showing that exploitability can correlate poorly with one-on-one performance against a variety of opponents. In this thesis, we introduce a class of adaptive opponents called pretty-good responses that exploit a strategy but have only limited exploitative power. By playing a strategy against a variety of counter-strategies created with pretty-good responses, we get a more complete picture of strategy strength than that offered by exploitability alone. In addition, we show how standard no-regret algorithms can be modified to learn strategies that are strong against adaptive opponents. We prove that this technique can produce optimal strategies for playing against pretty-good responses. We empirically demonstrate the effectiveness of the technique by finding static strategies that are strong against Monte Carlo opponents who learn by sampling our strategy, including the UCT Monte Carlo tree search algorithm.
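
As a rough illustration of why worst-case performance and typical performance can diverge, the sketch below computes exploitability for a fixed strategy in rock-paper-scissors and compares it with the same strategy's value against a uniformly random opponent. This is not the thesis's method: the matrix game, the function names, and the uniform opponent are all assumptions chosen for illustration, and the pretty-good responses defined in the thesis are not modeled here.

```python
# Row player's payoffs in rock-paper-scissors (zero-sum, game value 0).
# Illustrative toy game only; the thesis works with extensive-form games.
RPS = [
    [0, -1, 1],   # rock     vs rock, paper, scissors
    [1, 0, -1],   # paper    vs rock, paper, scissors
    [-1, 1, 0],   # scissors vs rock, paper, scissors
]


def payoffs_per_reply(payoff, strategy):
    """Expected row-player payoff against each pure column action."""
    n_cols = len(payoff[0])
    return [
        sum(p * row[j] for p, row in zip(strategy, payoff))
        for j in range(n_cols)
    ]


def exploitability(payoff, strategy, game_value=0.0):
    """Loss relative to the game value against a best-responding opponent."""
    return game_value - min(payoffs_per_reply(payoff, strategy))


def value_vs_uniform(payoff, strategy):
    """Expected payoff against an opponent who picks actions uniformly."""
    replies = payoffs_per_reply(payoff, strategy)
    return sum(replies) / len(replies)


# A hypothetical strategy that overplays rock.
skewed = [0.5, 0.25, 0.25]
print("exploitability:", exploitability(RPS, skewed))
print("value vs uniform opponent:", value_vs_uniform(RPS, skewed))
```

In this toy setting the skewed strategy loses 0.25 per game to a best response, yet breaks even against the uniform opponent, which is the kind of gap between worst-case and one-on-one performance that the abstract describes.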

Similar resources

Compensatory and Rehabilitative Cognitive Training Improves Executive Functions and Metacognition

Purpose: Improving Executive Functions (EF) is considered a pivotal axis of cognitive rehabilitation and enhancement. Scholars believe that EF can be practiced and improved as a way to strengthen cognitive ability. The main objective of the present paper is to boost executive functions and metacognition via compensatory and rehabilitative cognitive training. Metho...

Restoring Motor Functions in Paralyzed Limbs through Intraspinal Multielectrode Microstimulation Using Fuzzy Logic Control and Lag Compensator

In this paper, a control strategy is proposed for controlling ankle movement in animals using intraspinal microstimulation (ISMS). The proposed method is based on fuzzy logic control, an intelligent control methodology that mimics the human decision-making process. This type of control method can be very useful for complex, uncertain systems whose mathematical model ...

Mixture of Xylose and Glucose Affects Xylitol Production by Pichia guilliermondii: Model Prediction Using Artificial Neural Network

Production of several yeast products occurs in the presence of mixtures of monosaccharides. The present work was designed to study the effect of xylose and glucose mixtures, with system aeration and nitrogen source as the other two operating variables, on xylitol production by Pichia guilliermondii. An Artificial Neural Network (ANN) strategy was used to mathematically show the interplay between these three c...

Evaluation and comparison of cognitive flexibility, selective attention and response inhibition in male and female bilingual and monolingual students

In different parts of the world, people speak different languages to each other. Some regions are more linguistically rich, with more than one language spoken. The aim of this study was to evaluate and compare the executive functions of the brain, including cognitive flexibility, selective attention and response inhibition, in monolingual and bilingual male and fem...

Examining the effectiveness of initial response training program for nuclear emergency preparedness

Background: Although nuclear technology has various benefits, it also carries a variety of risks. In particular, the initial response is very important in responding to those risks. Therefore, a program to increase initial response proficiency can be regarded as essential. The Republic of Korea annually conducts more than 10 nuclear emergency response training programs, and specialized training courses f...

Efficient Optimum Design of Structures With Frequency Response Constraint Using High Quality Approximation

An efficient technique is presented for optimum design of structures with both natural frequency and complex frequency response constraints. The main idea is to reduce the number of dynamic analyses by introducing high quality approximation. Eigenvalues are approximated using the Rayleigh quotient. Eigenvectors are also approximated for the evaluation of eigenvalues and frequency responses. A tw...

Publication date: 2015